Comparing Multilingual Comparable Articles Based On Opinions
Abstract
Multilingual sentiment analysis is attracting increasing attention with the massive growth of multilingual web content. This motivates the study of opinions across languages by comparing the underlying messages written by different people holding different opinions. In this paper, we propose Sentiment based Comparability Measures (SCM) to compare opinions in multilingual comparable articles without translating the source or target into the same language. This will allow media trackers (journalists) to automatically detect splits in public opinion across huge volumes of multilingual web content. Developing SCM requires parallel sentiment corpora, which must be either obtained or built. Because corpora of this kind are not available, we decided to build them. To that end, we propose a new method to automatically label parallel corpora with sentiment classes. We then use the extracted parallel sentiment corpora to develop a multilingual sentiment analysis system. Experimental results show that the proposed measure can capture differences in opinions. The results also show that comparable articles vary in their objectivity and positivity.
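As a rough illustration of comparing opinions without translation, the sketch below assumes that every sentence of each article has already been labelled by a sentiment classifier for that article's own language. The label names, the `opinion_profile` and `sentiment_gap` helpers, and the gap formulas are assumptions made for this example; they are not the SCM definitions from the paper.

```python
from collections import Counter


def opinion_profile(labels):
    """Summarise one article from its per-sentence sentiment labels.

    Assumed labels: "objective", "positive", "negative".
    Returns (subjectivity, positivity), where
      subjectivity = share of sentences labelled positive or negative
      positivity   = share of subjective sentences labelled positive
    """
    total = len(labels)
    if total == 0:
        return 0.0, 0.0
    counts = Counter(labels)
    subjective = counts["positive"] + counts["negative"]
    subjectivity = subjective / total
    positivity = counts["positive"] / subjective if subjective else 0.0
    return subjectivity, positivity


def sentiment_gap(source_labels, target_labels):
    """Absolute differences between two articles' opinion profiles.

    Only the labels are compared, so neither article is translated.
    """
    src_subj, src_pos = opinion_profile(source_labels)
    tgt_subj, tgt_pos = opinion_profile(target_labels)
    return abs(src_subj - tgt_subj), abs(src_pos - tgt_pos)


# Hypothetical labels for a pair of comparable articles in two languages.
source_article = ["objective", "positive", "positive", "negative"]
target_article = ["objective", "objective", "negative", "negative"]

subjectivity_gap, positivity_gap = sentiment_gap(source_article, target_article)
print(f"subjectivity gap: {subjectivity_gap:.2f}")
print(f"positivity gap:   {positivity_gap:.2f}")
```

A larger gap on either axis would suggest that the two articles treat the same topic with different degrees of objectivity or positivity, which is the kind of difference the abstract reports.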
Similar resources
Neural Machine Translation for Low Resource Languages using Bilingual Lexicon Induced by Comparable Corpora
Automatically extracting parallel sentence pairs from the multilingual articles available on the Internet can address the data sparsity problem in building multilingual natural language processing applications, especially in machine translation. In this project, we have used an end-to-end siamese bidirectional recurrent neural network to generate parallel sentences from comparable multilingual ...
Document Categorization using Multilingual Associative Networks based on Wikipedia
Associative networks are a connectionist language model with the ability to categorize large sets of documents. In this research we combine monolingual associative networks based on Wikipedia to create a larger, multilingual associative network, using the cross-lingual connections between Wikipedia articles. We prove that such multilingual associative networks perform better than monolingual as...
MINT: A Method for Effective and Scalable Mining of Named Entity Transliterations from Large Comparable Corpora
In this paper, we address the problem of mining transliterations of Named Entities (NEs) from large comparable corpora. We leverage the empirical fact that multilingual news articles with similar news content are rich in Named Entity Transliteration Equivalents (NETEs). Our mining algorithm, MINT, uses a cross-language document similarity model to align multilingual news articles and then mines...
Cluster Labeling for Multilingual Scatter/Gather Using Comparable Corpora
Scatter/Gather systems are becoming increasingly useful for browsing document corpora. The usability of present-day systems is restricted to monolingual corpora, and their methods for clustering and labeling do not easily extend to the multilingual setting, especially in the absence of dictionaries/machine translation. In this paper, we study the cluster labeling problem for multilingual corpor...
Comparing Speech Recognizers Derived from Mono- and Multilingual Grammars
This paper examines the performance of multilingual parameterized grammar rules on speech recognition. We present a performance comparison of two different types of Japanese and English grammar-based speech recognizers. One system is derived from monolingual grammar rules and the other from multilingual parameterized grammar rules. The latter one uses hence the same grammar rules for creation o...
Publication date: 2013